Designing deep neural networks (DNNs) that run on edge hardware remains a challenge. The community has adopted standard designs to facilitate the deployment of neural network models, but little emphasis has been placed on adapting the network topology to fit hardware constraints. In this paper, we adapt MobileNetV2, one of the most widely used architectures for mobile hardware platforms, and study the impact of changing its topology and applying post-training quantization. We discuss the impact of these adaptations and of deploying the model on an embedded hardware platform for face detection.
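The post-training quantization step mentioned above is typically performed with an off-the-shelf converter. Below is a minimal sketch using TensorFlow Lite on a Keras MobileNetV2 backbone; the 96x96 input size, width multiplier, and random calibration data are illustrative assumptions, not the paper's exact configuration.

```python
import tensorflow as tf

# Illustrative backbone only; the paper's adapted face-detection topology is not reproduced here.
model = tf.keras.applications.MobileNetV2(input_shape=(96, 96, 3), alpha=0.35, weights=None)

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]

def representative_data():
    # In practice, yield a few hundred real preprocessed images for calibration.
    for _ in range(100):
        yield [tf.random.uniform((1, 96, 96, 3), dtype=tf.float32)]

converter.representative_dataset = representative_data
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8   # full-integer I/O for integer-only accelerators
converter.inference_output_type = tf.uint8

with open("mobilenetv2_int8.tflite", "wb") as f:
    f.write(converter.convert())
```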
The prohibitive costs of high-resolution images and exhaustive local annotations have hampered progress in digital pathology. A common paradigm for classifying pathology images is patch-based processing, often combined with multiple instance learning (MIL) to aggregate local patch-level representations into an image-level prediction. Nevertheless, diagnostically relevant regions may account for only a small fraction of the whole tissue, and current MIL-based approaches typically process images uniformly, discarding inter-region interactions. To alleviate these issues, we propose ScoreNet, a new efficient transformer that exploits a differentiable proposal stage to extract discriminative image regions and dedicate computational resources accordingly. The proposed transformer leverages local and global attention over a few dynamically recommended high-resolution regions at an efficient computational cost. We further introduce ScoreMix, a novel mixing data augmentation that leverages the semantic distribution of the image to guide data mixing and produce coherent sample-label pairs. ScoreMix is embarrassingly simple and mitigates the pitfalls of previous augmentations, which assume a uniform semantic distribution and risk mislabeling the mixed samples. Thorough experiments and ablation studies on three hematoxylin and eosin (H&E) breast cancer histology datasets validate the superiority of our approach over prior art, including transformer-based models for tumor region-of-interest (TRoI) classification. Equipped with the proposed ScoreMix augmentation, ScoreNet exhibits better generalization capabilities and achieves new state-of-the-art (SOTA) results with only 50% of the data. Finally, ScoreNet yields high efficacy and outperforms SOTA efficient transformers, namely TransPath and SwinTransformer.
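To make the idea of semantically guided mixing concrete, here is a deliberately simplified, CutMix-style sketch. It is not the paper's ScoreMix algorithm: the fixed patch size, the integral-image patch selection, and the label-mixing rule are assumptions introduced purely for illustration.

```python
import numpy as np

def score_guided_mix(img_a, img_b, y_a, y_b, score_b, patch=64):
    """Toy score-guided mixing, not the paper's exact ScoreMix.

    img_*: HxWxC arrays, y_*: one-hot labels, score_b: HxW semantic score map for img_b.
    The highest-scoring patch of img_b replaces the same window in img_a, and the
    labels are mixed by the fraction of semantic score mass transferred.
    """
    H, W = score_b.shape
    # integral image lets us evaluate every patch-sized window in O(1)
    ii = np.pad(score_b, ((1, 0), (1, 0))).cumsum(0).cumsum(1)
    sums = (ii[patch:, patch:] - ii[:-patch, patch:]
            - ii[patch:, :-patch] + ii[:-patch, :-patch])
    r, c = np.unravel_index(np.argmax(sums), sums.shape)

    mixed = img_a.copy()
    mixed[r:r + patch, c:c + patch] = img_b[r:r + patch, c:c + patch]
    lam = sums[r, c] / (score_b.sum() + 1e-8)        # share of semantic mass pasted in
    return mixed, (1.0 - lam) * y_a + lam * y_b
```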
Deep learning methods have become the state of the art for undersampled MR reconstruction. In particular, for scenarios where acquiring fully sampled ground-truth data is infeasible or impossible, self-supervised machine learning methods for reconstruction are increasingly being used. However, potential issues in the validation of such methods, as well as their generalizability, remain underexplored. In this paper, we investigate important aspects of validating self-supervised algorithms for reconstructing undersampled MR images: quantitative evaluation of prospective reconstructions, potential differences between prospective and retrospective reconstructions, the suitability of commonly used quantitative metrics, and generalizability. Two self-supervised algorithms, based on self-supervised denoising and the deep image prior, are investigated. These methods are compared against least-squares fitting and compressed sensing reconstruction using in vivo and phantom data. Their generalizability is tested on prospectively undersampled data acquired differently from the training data. We show that prospective reconstructions may exhibit serious distortions relative to retrospective reconstructions/ground truth. Moreover, pixel-wise quantitative metrics may fail to capture differences in perceptual quality that perceptual metrics detect. In addition, all methods show potential for generalization, although generalizability was affected more strongly by some changes than by others. We further show that no-reference image metrics correspond well with human ratings of image quality when studying generalizability. Finally, we demonstrate that a well-tuned compressed sensing reconstruction and learned denoising perform similarly across all data.
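As a point of reference for the compressed sensing baseline mentioned above, the sketch below shows a bare-bones single-coil ISTA reconstruction from retrospectively undersampled k-space. It uses image-domain sparsity instead of a wavelet transform and ignores coil sensitivities, both simplifying assumptions for illustration.

```python
import numpy as np

def cs_recon_ista(kspace, mask, lam=1e-3, n_iter=200):
    """Minimal ISTA for  min_x 0.5*||M F x - y||^2 + lam*||x||_1  (orthonormal FFT).

    kspace: undersampled measurements y (zero where mask == 0); mask: binary sampling mask.
    """
    x = np.fft.ifft2(kspace, norm="ortho")                  # zero-filled starting point
    for _ in range(n_iter):
        resid = mask * np.fft.fft2(x, norm="ortho") - kspace
        x = x - np.fft.ifft2(resid, norm="ortho")           # gradient step, step size 1 (||M F|| <= 1)
        mag = np.abs(x)
        x = np.exp(1j * np.angle(x)) * np.maximum(mag - lam, 0.0)   # complex soft-thresholding
    return x
```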
Deep neural networks may easily memorize noisy labels present in real-world data, which degrades their ability to generalize. It is therefore important to track and evaluate the robustness of models against noisy label memorization. We propose a metric, called susceptibility, to gauge such memorization for neural networks. Susceptibility is simple and easy to compute during training. Moreover, it does not require access to ground-truth labels and it only uses unlabeled data. We empirically show the effectiveness of our metric in tracking memorization on various architectures and datasets and provide theoretical insights into the design of the susceptibility metric. Finally, we show through extensive experiments on datasets with synthetic and real-world label noise that one can utilize susceptibility and the overall training accuracy to distinguish models that maintain a low memorization on the training set and generalize well to unseen clean data.
Multi-lingual language models (LM), such as mBERT, XLM-R, mT5, mBART, have been remarkably successful in enabling natural language tasks in low-resource languages through cross-lingual transfer from high-resource ones. In this work, we try to better understand how such models, specifically mT5, transfer *any* linguistic and semantic knowledge across languages, even though no explicit cross-lingual signals are provided during pre-training. Rather, only unannotated texts from each language are presented to the model separately and independently of one another, and the model appears to implicitly learn cross-lingual connections. This raises several questions that motivate our study, such as: Are the cross-lingual connections between every language pair equally strong? What properties of source and target language impact the strength of cross-lingual transfer? Can we quantify the impact of those properties on the cross-lingual transfer? In our investigation, we analyze a pre-trained mT5 to discover the attributes of cross-lingual connections learned by the model. Through a statistical interpretation framework over 90 language pairs across three tasks, we show that transfer performance can be modeled by a few linguistic and data-derived features. These observations enable us to interpret cross-lingual understanding of the mT5 model. Through these observations, one can favorably choose the best source language for a task, and can anticipate its training data demands. A key finding of this work is that similarity of syntax, morphology and phonology are good predictors of cross-lingual transfer, significantly more than just the lexical similarity of languages. For a given language, we are able to predict zero-shot performance, that increases on a logarithmic scale with the number of few-shot target language data points.
Neural networks can be trained to solve regression problems by using gradient-based methods to minimize the square loss. However, practitioners often prefer to reformulate regression as a classification problem, observing that training on the cross entropy loss results in better performance. By focusing on two-layer ReLU networks, which can be fully characterized by measures over their feature space, we explore how the implicit bias induced by gradient-based optimization could partly explain the above phenomenon. We provide theoretical evidence that the regression formulation yields a measure whose support can differ greatly from that for classification, in the case of one-dimensional data. Our proposed optimal supports correspond directly to the features learned by the input layer of the network. The different nature of these supports sheds light on possible optimization difficulties the square loss could encounter during training, and we present empirical results illustrating this phenomenon.
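A minimal sketch of the two training formulations on one-dimensional data is given below (PyTorch); the toy target, network width, and bin count are arbitrary choices for illustration, and the snippet only sets up the square-loss and cross-entropy runs side by side rather than reproducing the paper's analysis.

```python
import torch
import torch.nn as nn

def make_net(out_dim):
    # two-layer ReLU network on scalar inputs, matching the setting described above
    return nn.Sequential(nn.Linear(1, 128), nn.ReLU(), nn.Linear(128, out_dim))

x = torch.linspace(-1.0, 1.0, 512).unsqueeze(1)
y = torch.sin(3.0 * x)                                   # toy 1-d regression target

# (a) regression: square loss on a scalar output
reg_net, reg_loss = make_net(1), nn.MSELoss()
# (b) regression recast as classification: discretize the target into bins
n_bins = 32
labels = torch.bucketize(y.squeeze(1), torch.linspace(-1.0, 1.0, n_bins - 1))
cls_net, cls_loss = make_net(n_bins), nn.CrossEntropyLoss()

for net, loss_fn, target in [(reg_net, reg_loss, y), (cls_net, cls_loss, labels)]:
    opt = torch.optim.SGD(net.parameters(), lr=0.1)
    for _ in range(2000):
        opt.zero_grad()
        loss_fn(net(x), target).backward()
        opt.step()
    print(type(loss_fn).__name__, "final loss:", loss_fn(net(x), target).item())
```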
Usage-based insurance is becoming the new standard in vehicle insurance; it is therefore important to find efficient ways of using insureds' driving data. By applying anomaly detection to vehicles' trip summaries, we develop a method that derives a "routine" and a "peculiarity" anomaly profile for each vehicle. To this end, anomaly detection algorithms are used to compute a routine and a peculiarity anomaly score for every trip a vehicle makes. The former measures how anomalous a trip is compared with the other trips made by the same vehicle, while the latter measures how anomalous it is compared with trips made by any vehicle. The resulting vectors of anomaly scores serve as the routine and peculiarity profiles. Features are then extracted from these profiles, and we study their predictive power in a claim classification framework. Using real data, we find that the features extracted from the vehicles' peculiarity profiles improve classification.
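The two per-trip scores can be illustrated with any off-the-shelf anomaly detector; the sketch below uses scikit-learn's IsolationForest on a trip-summary table. The column names and the choice of detector are assumptions, and the subsequent profile/feature-extraction step described above is not shown.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest

def anomaly_scores(trips: pd.DataFrame, feature_cols, vehicle_col="vehicle_id"):
    """Toy version of the two per-trip scores described above.

    routine score    : how anomalous a trip is w.r.t. the *same* vehicle's trips
    peculiarity score: how anomalous a trip is w.r.t. trips made by *any* vehicle
    Returns the frame with two new columns (higher = more anomalous).
    """
    X_all = trips[feature_cols].to_numpy()
    trips = trips.copy()

    global_forest = IsolationForest(random_state=0).fit(X_all)
    trips["peculiarity_score"] = -global_forest.score_samples(X_all)

    routine = np.empty(len(trips))
    for vid, idx in trips.groupby(vehicle_col).indices.items():
        per_vehicle_forest = IsolationForest(random_state=0).fit(X_all[idx])
        routine[idx] = -per_vehicle_forest.score_samples(X_all[idx])
    trips["routine_score"] = routine
    return trips
```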
This paper studies the statistical model of the non-centered mixture of scaled Gaussian distributions (NC-MSG). Using the Fisher-Rao information geometry associated with this distribution, we derive a Riemannian gradient descent algorithm. This algorithm is applied to two minimization problems. The first is the minimization of a regularized negative log-likelihood (NLL), which balances a trade-off between the white Gaussian distribution and the NC-MSG. Conditions on the regularization are given so that the existence of a minimum is guaranteed without assumptions on the samples. Then, the Kullback-Leibler (KL) divergence between two NC-MSGs is derived. This divergence allows us to define a minimization problem for computing the center of mass of several NC-MSGs. The proposed Riemannian gradient descent algorithm is leveraged to solve this second minimization problem. Numerical experiments show the good performance and speed of the Riemannian gradient descent on both problems. Finally, a nearest centroid classifier is implemented, leveraging the KL divergence and its associated center of mass. Applied to the large-scale dataset Breizhcrops, this classifier shows good accuracy as well as robustness to rigid transformations of the test set.
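For intuition only, here is the shape of a KL-based nearest-centroid rule using the closed-form divergence between plain multivariate Gaussians as a stand-in; the paper's NC-MSG divergence and Riemannian center-of-mass computation are not reproduced here.

```python
import numpy as np

def gaussian_kl(mu0, cov0, mu1, cov1):
    """Closed-form KL( N(mu0, cov0) || N(mu1, cov1) ) -- a Gaussian stand-in,
    not the NC-MSG divergence derived in the paper."""
    k = mu0.size
    cov1_inv = np.linalg.inv(cov1)
    diff = mu1 - mu0
    return 0.5 * (np.trace(cov1_inv @ cov0) + diff @ cov1_inv @ diff - k
                  + np.linalg.slogdet(cov1)[1] - np.linalg.slogdet(cov0)[1])

def nearest_centroid(mu, cov, class_centroids):
    """Assign a sample with statistics (mu, cov) to the class whose centroid
    (a dict entry class -> (mu_c, cov_c)) minimizes the divergence."""
    return min(class_centroids,
               key=lambda c: gaussian_kl(mu, cov, *class_centroids[c]))
```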
Cyber Threat Intelligence (CTI) sharing is an important activity for reducing the information asymmetry between attackers and defenders. However, this activity faces challenges due to the tension between data sharing and confidentiality, which leads to information retention and often results in a free-rider problem. The information that is shared therefore represents only the tip of the iceberg. The current literature assumes access to centralized databases containing all the information, which is not always feasible because of this tension. The result is unbalanced or incomplete datasets that need to be expanded with additional techniques; we show how these techniques lead to misleading results and performance expectations. We propose a novel framework for extracting CTI about incidents, vulnerabilities, and indicators of compromise from distributed data, and demonstrate its use in several practical scenarios in conjunction with the Malware Information Sharing Platform (MISP). Policy implications for CTI sharing are presented and discussed. The proposed system relies on an effective combination of privacy-enhancing technologies and federated processing. This lets organizations remain in control of their CTI and minimize the risks of exposure or leakage, while enabling the benefits of sharing, more accurate and representative results, and more effective predictive and preventive defenses.
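One common privacy-enhancing building block for this style of federated aggregation is additive secret sharing. The toy sketch below, where the counts and number of parties are made up for illustration and which is not the paper's actual protocol, lets several organizations reveal only an aggregate sighting count for an indicator of compromise.

```python
import secrets

MOD = 2**61 - 1  # large prime modulus for additive shares

def share(value, n_parties):
    """Split an integer into n additive shares that sum to value mod MOD."""
    shares = [secrets.randbelow(MOD) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MOD)
    return shares

# Toy illustration: three organisations privately aggregate how often an IoC was sighted.
local_counts = {"org_a": 4, "org_b": 0, "org_c": 11}
all_shares = {org: share(c, 3) for org, c in local_counts.items()}

# each aggregator i sums the i-th share from every organisation ...
partial_sums = [sum(all_shares[org][i] for org in all_shares) % MOD for i in range(3)]
# ... and only the recombined total is ever revealed
total = sum(partial_sums) % MOD
print(total)  # 15
```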
For companies that operate customer service, mapping the intents present in their conversation data is crucial for building applications based on natural language understanding (NLU). However, there is no established automated technique for gathering intents from noisy online chats or voice transcripts, and simple clustering approaches are not suited to extracting intents from dialogues. To address this intent-landscape task, we propose an unsupervised pipeline that extracts intents and their taxonomy from real-world dialogues. Our pipeline mines intent-span candidates with an extractive question-answering ELECTRA model and leverages sentence embeddings to apply low-level density clustering followed by top-level hierarchical clustering. Our results demonstrate the generalization ability of an ELECTRA-large model fine-tuned on the SQuAD2 dataset to understand dialogues: with the right prompting question, the model achieves a rate of linguistic validation of intents above 85%. Furthermore, we reconstruct the intent schemes of five domains from the MultiWOZ dataset with an average recall of 94.3%.
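A rough approximation of such a pipeline can be assembled from off-the-shelf components, as sketched below; the QA model id, prompting question, embedding model, and clustering parameters are all assumptions for illustration rather than the configuration used in the work above.

```python
from transformers import pipeline
from sentence_transformers import SentenceTransformer
from sklearn.cluster import AgglomerativeClustering
import hdbscan

utterances = [
    "hi, I want to change the delivery address on my order",
    "can you cancel my order from yesterday please",
    "I'd like to book a table for two tonight",
    "please cancel the subscription on my account",
]

# 1) mine intent-span candidates with an extractive QA model (model id is an assumption)
qa = pipeline("question-answering", model="deepset/electra-base-squad2")
spans = [qa(question="What does the customer want to do?", context=u)["answer"]
         for u in utterances]

# 2) embed the candidate spans
encoder = SentenceTransformer("all-MiniLM-L6-v2")
emb = encoder.encode(spans, normalize_embeddings=True)

# 3) low-level density clustering, then a top-level hierarchical grouping
low_level = hdbscan.HDBSCAN(min_cluster_size=2).fit_predict(emb)
top_level = AgglomerativeClustering(n_clusters=2).fit_predict(emb)
print(list(zip(spans, low_level, top_level)))
```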